A real-space grid implementation of the Projector Augmented Wave method 
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A grid-based real-space implementation of the Projector Augmented Wave (PAW) method of 
P. E. Blochl [Phys. Rev. B 50, 17953 (1994)] for Density Functional Theory (DFT) calculations 
is presented. The use of uniform 3D real-space grids for representing wave functions, densities 
and potentials allows for flexible boundary conditions, efficient multigrid algorithms for solving 
Poisson and Kohn-Sham equations, and efficient parallelization using simple real-space domain- 
decomposition. We use the PAW method to perform all-electron calculations in the frozen core 
approximation, with smooth valence wave functions that can be represented on relatively coarse 
grids. We demonstrate the accuracy of the method by calculating the atomization energies of twenty 
small molecules, and the bulk modulus and lattice constants of bulk aluminum. We show that the 
approach in terms of computational efficiency is comparable to standard plane-wave methods, but 
the memory requirements are higher. 

PACS numbers: 71.15.D, 31.15.E 
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I. INTRODUCTION 

Density Functional TheorjiS (DFT) combined with 
the Generalized Gradient Approximation (GGA) for 
the exchange and correlation functional has become a 
popular method for studying materials and molecules 
at the atomic scale. Recently, there has been 
an increasing interest in using uniform real-space 
grids and finite-difference methods for doing DFT 
calculationa 3 ! 4 ! 5 ! 6 ! 7 ! 8 ! 9 ! 10 ! 11 ! 12 ! 13 ! 14 ! 15 : 16 . Real-space grids 
give an unbiased description of the wave functions and 
the quality of the description can easily be controlled by 
changing the grid-point density. Finite-difference opera- 
tors are used because the wave function values are given 
on grid points in real-space, and not in terms of a basis 
set. By doing all operations in real-space, paralleliza- 
tion can be done by simple domain decomposition 6,1718 . 
Furthermore, real-space methods can make use of multi- 
grid acceleration schemed for solving the Kohn-Sham 
equations^ and the Poisson equation. A further ad- 
vantage of real space methods is the possibility for im- 
posing localization constraints on the wave functions, 
which is the basis for linearly scaling electronic structure 
methods^ 7 " (order-iV methods). 

Today, one of the most used methods for performing 
DFT calculations is the pseudo potential method using 
periodic super-cells and plane-wave expansions for the 
pseudo wave- functions. This method shares with the 
grid-based methods the properties of unbiased represen- 
tation of the wave functions and simple control of the 
quality of a calculation (by changing the number of plane 
waves). However, there are three major difficulties with 
a plane-wave representation for the wave functions. 1) 
Working with spatially localized wave functions, which 
is important for order- N methods, is difficult with the 
extended nature of plane waves. 2) Not all operations in- 
volving the wave functions, densities and potentials can 
be done directly in the plane-wave representation, and 
Fourier transformations to and from real-space must be 



carried out. Transformations between real and recipro- 
cal spaces are highly non-local operations, and therefore 
difficult to parallelize. 3) Due to the periodicity of plane 
waves, the natural boundary conditions for a plane wave 
calculation is periodic boundary conditions. Although 
all three problems have been addresse d 2 ?: 21 ' 22 : 23 ' 24 , the 
suggested solutions are not as simple as for grid-based 
approaches where all three problems have simple solu- 
tions. 

An advantage of a plane-wave representation for the 
wave functions is its compactness. The memory footprint 
of a wave function is typically 10 times larger in a real- 
space grid representation compared to a plane-wave rep- 
resentation of similar accuracy. For this reason it is im- 
portant to use soft pseudo wave functions that can be ac- 
curately represented on coarse grids. To our knowledge, 
until now, all applications of grid-based electronic struc- 
ture calculations have used norm-conserving pseudopo- 
tentials. One way to get smoother pseudo wave functions 
is to relax the norm-conservation of the wave functions 
and use ultra-soft pseudopotential a 2 ^ 26 or the Projec- 
tor Augmented Wave (PAW) metho d 27 : 28 . We have de- 
cided to use the PAW method. The PAW method works 
with soft valence wave functions and, similar to the ultra- 
soft pseudopotential method, the wave functions need not 
be normalized. Contrary to the ultra-soft pseudopoten- 
tial method, the PAW method is an all-electron method 
within the frozen core approximation, giving access to 
the true wave functions and the full electron density. The 
PAW method has been implemented for plane waves by 
several groupa 27 i 28 ' 29 i 30 : 31 : 32 . 

We see the combination of real-space grid-based meth- 
ods and the PAW method as an important step towards 
enabling larger calculations at a level of accuracy which is 
essentially all-electron in nature. There is a clear trend 
in electronic structure theory towards larger and more 
complex systems as for example nanostructures, large 
(bio-)molecular complexes and extended defects in real 
materials. Systems which all quickly challenge present 
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day high accuracy DFT codes, that are typically limited 
to at most a few hundred atoms. The great potential of 
the method presented here lies in the parallelization of 
the real-space algorithms. This makes it possible to make 
use of massively parallel computers as has been demon- 
strated by several other groupsS^i^. In this paper, we 
focus on how to do accurate DFT calculations efficiently 
by using a real-space PAW method. We demonstrate the 
accuracy of our grid-based PAW calculations by show- 
ing that we are able to reproduce results for atomization 
energies from all-electron DFT calculations. This very 
stringent test shows that the methodology that we have 
developed is useful for real applications. 

The solution of Poisson's equation is straightforward 
using multigrid methods^ (no Fourier transformations 
required). Solving the Kohn-Sham equations using multi- 
grid methods is a much more difficult task: Keeping the 
different eigenstates separated and orthogonal to each 
other can be a problem, and representing the Hamilto- 
nian on the coarse grids can also be problematic. We have 
decided to use the techniques typically used in "state 
of the art" pseudopotential plane-wave calculations^ 3 , as 
they have been developed and improved over the the 
last few decades. For iteratively solving the Kohn-Sham 
equations, we use Pulay mixing techniques for obtain- 
ing the self-consistent densityS* 3 ^, subspace diagonaliza- 
tions, and the Residual Minimization Method 33,35 using 
preconditioning of the electronic gradients for iteratively 
improving the wave functions. The preconditioning oper- 
ation is a single multigrid V-cycle using only the kinetic 
energy operator as an approximate Hamiltonian 6 . 

In the following section, we will briefly summarize the 
PAW method. Then, in section HTT1 we will go through 
the details of our grid-based formulation of the PAW 
formalism. In section IIVI we describe how we solve 
the Kohn-Sham equations, and the evaluation of atomic 
forces is discussed in section^ Section lvTl describes gen- 
eralizations of the method to periodic systems with use 
of Brillouin zone sampling. In section IVIII we apply the 
methodology that we have developed to a number of ex- 
ample systems and discuss approximations necessary for 
realistic calculations. Finally, in section IVIIII we discuss 
the computational performance of our implementation. 



II. THE PROJECTOR AUGMENTED WAVE 
METHOD 

The notation we use is close to the one used by P. 
E. Blochl in his papers on the PAW metho d 2 ?' 2 ? . We 
have used Hartree atomic units (% = m = e = 1) and 
we write the equations for the case of a spin-paired and 
finite system of electrons. 

The PAW method is based on a transformation be- 
tween smooth pseudo wave functions, ip n , and the true 
all-electron Kohn-Sham wave functions, ip n [n is the band 
index). The core states of the atoms, </>"' corc , are frozen. 
Here a is an atom index and i is a combination of prin- 



cipal, angular momentum, and magnetic quantum num- 
bers respectively (n, £ and m). 

Given a smooth pseudo wave function, the correspond- 
ing all-electron wave function, which is orthogonal to the 
set of 0" ,corc orbitals, can be obtained through a linear 
transformation 

^(r)=f^ n (r). (1) 

The transformation operator, T, is given in terms of 
atom centered all-electron wave functions, 4>f(r), the cor- 
responding smooth partial waves, 0"(r), and projector 
functions, pf (r), as 

t=i+£Di#>-i#»<p?i- (2) 

a i 

The atom centered all-electron wave functions are taken 
from a calculation of a single atom with spherical symme- 
try: <j>f(r) — 0^(r)Yt(f), where the Y^s are real- valued 
spherical harmonics (L is a combined index for £ and to). 

A radial cutoff distance, r", defining the atomic aug- 
mentation sphere is chosen. This radius is similar to 
a cutoff radius for a pseudopotential. The larger the 
augmentation sphere, the smoother the pseudo wave 
functions, but overlap with neighboring augmentation 
spheres must be avoided. 

For all all-electron valence states smooth partial waves 
<f>i(r) — <^(r)Yi,(r) are constructed. The partial waves 
must match the corresponding all-electron waves for r > 
r" . In this way, the correction in parenthesis in Eq. @ 
is zero outside the augmentation spheres and we will 
have T = 1 in this region. Notice, that there are no 
norm-conservation requirements to meet when choosing 
the shape of 4>nt( r ) ms ide the augmentation sphere. 

Smooth projector functions must also be defined - 
one for each partial wave: pf(r) — p^(r)Yi(f). They 
must be localized inside the augmentation spheres and 
satisfy (pf ) = S ili2 , which for the radial part gives 

f" r 2 drp a nt (r)4> a n , e (r)=5 nn ,. (3) 

J o 

With this construction we have T(f)f(r) — <j>f(r). 

In principle, an infinite number of projectors and par- 
tial waves are required for the PAW method to be exact. 
For practical calculations, a high accuracy data set will 
need only one or two projector functions for each angu- 
lar momentum channel of importance. This is similar to 
an ultra-soft pseudopotential, where a comparable num- 
ber of projectors is needed. One partial wave is usually 
taken as the bound valence state, and additional waves 
can be taken from "excited states" — solutions to the 
radial Kohn-Sham equation at different non-eigenvalue 
energies. The construction of partial waves and projec- 
tor functions is described in appendix IXI Al 
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A. PAW densities 

/From the atomic frozen core electron density, n®(r), 
a smooth core electron density, n"(r), is constructed. 
It must be identical to n"(r) for r > r". There is no 
norm-conservation requirement to meet when choosing 
the shape of n®(r) inside the augmentation sphere. 

The pseudo electron density has contributions from the 
wave functions and from the atom centered smooth core 
electron densities: 

»(r) = fn\Mr)\ 2 + J2 ftc(k - (4) 

n a 

where the / n 's are occupation numbers and R° is the 
position of atom a. 

An atomic density matrix (see Eq. (22) in Ref. 12^) is 
defined as 

^=£<&I&>/»$»|P&> (5) 

n 

where 

Pni = (Pi\4>n) = J dvp1{v - R a )V^(r) (6) 

The PAW formalism defines atom centered all-electron 
and pseudo electron densities as 

n a (v) = J2 DU^l (r) + n»(r) (7) 

and 

n a (r) = J2 KJt M + *TO> (8) 

respectively. By construction, n a (r — R a ) is identical 
to the all-electron density, n(r), for |r — R a | < r? and 
fi a (r - R a ) = ra(r) for |r — R a | < r° (see Ref. ||3 for 
details). Therefore, the true all-electron density can be 
obtained from the pseudo electron density: 

n(r) = n(r) + ^K(r - R Q ) - n a {v - R Q )]. (9) 

a 

Again, the correction is zero outside the augmentation 
spheres. 

A neutral charge density, p(r), is obtained by adding 
compensation charges, Z a (r), inside the augmentation 
spheres to the pseudo electron density. These charges 
compensate for the lack of norm-conservation and for the 
nuclear charge: 

/ 5(r)=n(r)+]Tz a (r-R a ). (10) 

a 

Using localized functions g£(r) = §|(r)Yk(r) normal- 
ized as 

j dvr l ~gl{v)Y L (t)=l, (11) 



the compensation charges are constructed with electro- 
static multipole moments Q a L : 

Z a (r) = Y,Ql~g a L(r)- (12) 

L 

The values to be used for the electrostatic multipole 
moments, Q1, are found by requiring the pseudo charge 
density, h a + Z a , to have the same electrostatic multi- 
pole moments as the all-electron charge density, n a + Z a , 
where Z a (r) = —Z a S(r) is the nuclear charge density 
{Z a is the atomic number). This requirement can be 
expressed as 

J drr e [n a {r) + Z a (r) - n a {r) - Z a {r)]Y L (r) = 0. (13) 
Inserting Eqs. 0, © and JT2J we get: 

Ql = J2^ 2 Dh 2 +^m, (14) 

where the constants Af^ 4 and A a are given by: 
and 

A a = [ drY 0Q (v)[-Z a S(r) + n a c (r) - n«(r)]. (16) 



B. The PAW total energy 

The PAW total energy is a function of the pseudo wave 
functions, "0 n (r), and the occupation numbers, /„. The 
energy can be divided into a "soft" contribution, E, and 
corrections for each atom (see Ref . l27l and 128^ : 

E = E + ^{E a - E a ). (17) 

a 

The "soft" energy contribution is: 

E = J2fn [ dr^(r)(-iV 2 )^(r) 
+ \J drv H (r)p(r) + E xc [n(r)} 
+ [ drn(r)^ a (|r-R Q |), (18) 

J a 

where w H (r) is the pseudo Hartree potential, satisfy- 
ing Poisson's equation V 2 w H = —4irp and i? xc is an 
exchange-correlation functional. Finally, v a (r) is an ar- 
bitrary localized potential vanishing for r > r". The 
soft energy contribution, E, is to be evaluated on three 
dimensional grids in real space. 
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The atomic corrections to the energy {E a — E a ) are 
given by: 



E a 



core „ 

j dr^ c ° rc (r)(-IV 2 )#' COT0 (r) 

dr^(r)(-iV 2 )C 2 (r) 
,K(r) + Z a (r)][n a (r') + Z a (r')} 



YD a - 



dr J dt 

E xc [n a (r)} 



(19) 



and 

E a = 



E 

»1»2 



_ D a 



dr / dr' 



dr^(r)(-iV 2 )^ 2 (r) 
,[n Q (r) + Z Q (r)][n a (r') 



Z a (r')] 



|r — r 

+ £ xc [n°(r)] + / drn a (rK(r). 



(20) 



The energy contributions for E a and E a , are evaluated 
on radial grids inside the augmentation spheres. 

By using Eqs. Q, iJS), JT^Jl and lfT4^l we can reduce the 
atomic correction, E a — E a , to a function of Df , : 

' ' lll2 



£ a - E a =A a 



/ j «1«2 >1«2 



AE2 c ({D? ii2 }). 



D a ■ 



(21) 



The constants A a , Bf ■ and Cf ,■ ,■ ,■ are evaluated in the 

————— L l t 2 L l L 2 L 3 ''A. 

appendix 1X1 CI 

The last term is an exchange-correlation correction: 



AE a xc ({Dl t2 }) = E xc [n a (r)] - E xc [n a (v)}, 



(22) 



which is a function of Df^ through Eqs. Q and JSJ. 
For local and semi-local exchange-correlation function- 
als, AE% c ({Df ii2 }) is written as an integration inside 
the augmentation sphere. There are several possibilities 
for the evaluation of this term. We use radial integra- 
tion along lines from the center to a number of points 
distributed evenly on the surface of the augmentation 
sphere 29,36 . We find that this approach is the simplest 
for GGA functionals 37 . Alternatively, one can expand 
the atomic densities in spherical harmonics, as described 
in Ref. |2?] or use grid free approaches 3 ^ 3 ^. 



III. UNIFORM 3D GRIDS 

In this section we give the details of our real-space grid- 
based implementation of the PAW method. From now 
on, wave functions, electron densities and potentials are 
represented on three-dimensional uniform grids in real- 
space. 



A. Localized functions and the double grid 
technique 

In a grid representation, integrals over space are turned 
into sums over grid points. In the PAW method we of- 
ten need to calculate the integral of a localized function, 
centered on an atom, multiplied by a function extended 
over all of space. As an example, let us take a projector 
function, pf(r— R a ), centered on atom a at position R a , 
multiplied by an extended wave function, tp n (r). 

In the following, we use the index G to index the grid 
points used for the wave functions. Transforming the 
integral to a sum over grid points, G, with V c being the 
volume per grid point, we get: 



pa 



(23) 



where V'nG = "0n( r G) and r^ is the position of grid point 
G (only the grid points in the localized region around 
atom a needs to be summed over). For pf G we could 
use pf{rG — R a ). However, this is not accurate enough, 
unless we use a very fine grid, which would compromise 
efficiency Instead we use the elegant double grid tech- 
nique of Ono and Hirose^ -. Here, the extended func- 
tion is interpolated to a finer grid with grid points /: 
Vw = Y^G^fG^nG- The interpolation operator I fa 
takes the wave function from a coarse grid to a fine grid. 
Typically, a cubic interpolation is used and the fine grid 
has five times more points in each direction as the coarse 
gridi^°«. The localized projector function is evaluated 
on a fine grid as — Pfi^f — R a ). With v being the 
volume per fine grid point we get a more accurate sum: 

f / G 

= V c J2l(v/V c )J2 I fGP?f]lPnG- (24) 
G f 

Comparing the rightmost expression with Eq. (|23|l . we 
identify the term in square brackets as the more accurate 
expression for p® G : 



pIg 



V \ ^ 



Vc f 



(25) 



This is equivalent to a restriction operation taking 
the localized function from the fine temporary grid to 
the coarse grid (restriction is the opposite of interpola- 
tion) . Notice that in Eq. I|23|) , the sum is over coarse grid 
points, and no actual interpolation of the extended func- 
tion needs to be done. The evaluation of the localized 
function on the temporary grid is a relatively inexpen- 
sive operation, with an operation count which for each 
atom is independent of the system size. 

We use the double grid technique to transfer localized 
functions, such as pf (r), n^(r) and v a (r), to values on 
real-space grids (pf G , n° G and Vg). 
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B. Real-space grid formulation of the PAW method 

The formulas for densities, potentials, and energies, 
given in section [HJ must be translated to a discretized 
form for use with a discrete representation of wave func- 
tions, densities and potentials. The pseudo electron den- 
sity, Eq. @, is calculated on a coarse grid as 

h G ^J2f^\ 2 +J2 h cG- ( 26 ) 

n a 

The smooth atomic core electron densities can be cho- 
sen very soft, so that they can be added to the coarse 
grid. 

The pseudo electron density on the coarse grid is in- 
terpolated to a finer grid (grid points indexed by g): 

"■g = Yl I aGna- (27) 

G 

We use a cubic interpolation for I g Q 1 and the fine grid 
has twice as many grid points as the coarse grid in each 
direction. From the pseudo electron density on the fine 
grid, one can obtain the neutral charge density as [see 
Eq. 

P 3 =n g + J2 2 9- ( 28 ) 

a 

Using our grid representation for wave functions, den- 
sities and potentials, we get for E: 

e = E/«^E^E(4^)^ 

n G G' 

+ ^/E^+£xc({n ff },Vf) 
g 

+ ^/E^E^< (29) 

g a 

where V c and Vf are the volumes per grid point for the 
coarse wave function grids and the fine density and po- 
tential grids respectively, and Lqqi is a central finite dif- 
ference representation of the Laplacian. The discretiza- 
tion that we use for the Laplacian, uses a total of twelve 
neighbor points, giving an error of the order of h , where 

1 /3 

h = V c is the grid spacing. 

The pseudo Hartree potential is found by solving 
Poisson's equation, V 2 5 H = — 47T/5, using a discretization 
for the Laplacian: 

^C gg ,vf, = -^p g . (30) 
g' 

This equation is solved using the multigrid technique 
pioneered by Brandt^. Solving the Poisson equation it- 
eratively on the finest grid will quickly reduce the short 
wavelength errors, but errors with larger wavelengths 
compared to the grid spacing are reduced only slowly. 
The multigrid technique introduces a series of coarser 



grids where the long wavelength errors can be effectively 
reduced. 

We use a Mehrstellen discretization^, where C gg i — 
(B _1 A) gs / is expressed in terms of two short-ranged fi- 
nite difference operators A gg i and B gg i : 

J2 A 9 9 ' i l i '^-^J2 B sg'Pg'- ( 31 ) 

g' g' 

For the coarse grids used in the multigrid V-cycle, sim- 
ple nearest neighbor central finite-difference Laplacians 
are used. 



C. Soft compensation charges 

Adding the compensation charges, Z a , to the pseudo 
electron density [Eqs. IjlOjl and (|28[l ] will require a very 
fine density grid in order to get an accurate descrip- 
tion of the charge. The problem is that the compen- 
sation charges must be localized inside the augmentation 
spheres: <?£(r) = for r > r". Blochl has described a 
method for plane- wave basis sets2£, that allows the use of 
softer compensation charges extending outside the aug- 
mentation spheres. We use the same method for our 
grid-based approach. A cutoff radius f" larger than r" is 
chosen. Soft compensation charges with the same electro- 
static multipole moments as the localized compensation 
charges are constructed: 

Z a (r) = Y,Q a L9 a L(r), (32) 

L 

where g%{v) is a soft function localized within r < r°. 
The soft function, <?£(r), is normalized in the same way 
as <?£(r) — see Eq. <(TT|) . 

Equation l|28|l must now be replaced by: 

P g = n g + Y,Z« = n g + Y,Y,Q a L9 a Lg> ( 33 ) 

a ah 

and a correction must be added to E. The correction is 
(see Ref.i3): 

J drn(r) ^ v a (r - R Q ) + ^ U aa ' , (34) 

a aa' 

where v a (r) — J2l Ql^l( r )^ anc ^ 

m-f ^ 

The first term in Eq. I|34|l is evaluated on the grid 
as VfTlg^gTlaVg- The transformation of the local- 
ized potential, w a (r), and the localized compensation 
charge, Z a (r), (both vanishing for r > f") to values at 
grid points, v g and Z^, is done using the double grid 



6 



technique^. The last term in Eq. (|34[1 is a pair potential 
with range r° + f° : 



aa 1 



jjaa 



dr / dr' 



, t Z a (r-R a )Z a (r'-R CI ) 



r — r' 



Z a (r-R a )Z a (r'-R a 



LL' 



(36) 



where 



V£l' = I dr I dv 



, /.g£(r-R a ).g£,(r'-R a ) 



\ |r — r' 

g£(r-R a )g£',(r' -K a ') \ 
Ir-r'l 



(37) 



The pair potential terms, V££,, are functions of the dif- 
ference vectors R a — R a . 



D. Orthogonality 

The orthogonality constraint of the all-electron wave 
functions, (i/VilVv) = S nn ', can be expressed in terms of 
the pseudo wave functions^ as (VvJ0|Vv) = ^n', where 
the PAW overlap operator O is non-local: 



with 



0? ll2 = I dr[<(r)^(r)-^(r)^(r)] 



(38) 



4^Ag 0iii2 . 



The discretised overlap operator looks like: 
GG ' = 5 GG , + K^^^ff^iAff. 



(39) 



(40) 



and the orthogonality constraint of the pseudo wave func- 
tions can be expressed as 



v c Y,r nG OG G 4 



n'G> 



(41) 



GG' 



IV. FINDING THE GROUND STATE 

In order to find the electronic ground state it is neces- 
sary to calculate the derivatives of the total energy with 
respect to the wave function values. This "electronic gra- 
dient" can be expressed in terms of a Hamiltonian Hgc '■ 



1 dE 



= fn ^ HGG'j>nG' 



(42) 



The Hamiltonian is given as a sum of the kinetic energy 
operator and the local and non-local parts of the effective 
potential: 

H GG > = -h L GG> + v g Sgg' + V c J2zZpIgK^pIg'- 

a i 1 i 2 

Explicit formulas for the local effective potential, %, 
and the "atomic" Hamiltonian, iff.,-., are given in the 
appendix IXIDI 

The set of orthonormalizcd ground-state wave func- 
tions that diagonalize the Hamiltonian matrix H nn i = 
V c Y^GG' 1 PnG^GG''>Pn'G', must satisfy the Kohn-Sham 
equations: 



GG' 



e n OGG')^PnC 



0. 



(44) 



A. Residual minimization method and Pulay 
mixing 

In order to locate the self-consistent ground state, we 
use the Residual Minimization Method of Wood and 
Zunger— (see also Ref. 113) ■ The residuals are calculated 
as 



RnG = E^ GG " ~ £nOGG>)lpnG> 



(45) 



where e„ is the current estimate of the eigenvalue of the 
n'th band. Using a preconditioning operator P (to be 
discussed later), we can improve the wave function by 
taking a step along the direction of the preconditioned 
residual: ip n G + XPR n G (A is the step length). The opti- 
mal step length is found by minimizing the norm of the 
residual for the new guess: 

Kg = Eg>( h gg' - e„O GG 0(i G ' + APiW) 

= R nG + \Y,G'( H GG> - e n O G G')PRnG>- (46) 

Finding the optimal value for A amounts to finding the 
minimum of a second order polynomial in A. Having 
found the optimal step length for the first step, we do 
the actual update of the wave function by taking an ad- 
ditional step using the same step length in the direction 
of the preconditioned residual R' nG : 



i G «- 4>nG + XPRnG + APi^ 



(47) 



G 



For updating one wave function, one must apply the 
Hamiltonian twice, and two preconditioning operations 
are required. 

If we were to take steps along the residual (and not 
along the preconditioned residual), we would need very 
many iterations in order to converge. The problem is that 
the residual vector is not necessarily parallel to the error 
vector (which we don't know). The purpose of precon- 
ditioning is to produce a direction that more accurately 
represents the error vector—. 
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We would get the optimal preconditioned residual 
R n = PR n by solving (H — e n O)R n — R n . Instead of 
solving this equation exactly, we solve approximately the 
simpler equation —iV 2 i?, i = R n . This is done using one 
multigrid V-cycle, where a nearest neighbor discretiza- 
tion is used for the Laplacian on the coarse grids 6 . 

When all wave functions have been updated, the wave 
functions are othonormalized, and the density is up- 
dated. From the new density, a new Hamiltonian is 
generated (vq and Hf ii2 ). Finally, a subspace diago- 
nalization is performed, and the next iteration towards 
self-consistency can begin. For each iteration a new in- 
put density is estimated using Pulay mixing^i. Typically 
three old densities are used. The atomic density matri- 
ces, D° ij , are mixed as well. We start the iterations from 
a good guess at the wave function: A linear combination 
of atomic orbitals. 



V. FORCES 

The atomic force acting on an atom is defined as 
dE 



■pa _ 



9Ka h \WnG dRa C ' C ' 
dE 



V c J2 fntn J2 °GG' I ^£ + C.C. 



n GG' 



(48) 

In the last line, we have used Eqs. {H} and ijlijl. The 
variation of the wave function corresponding to a varia- 
tion in the position can be found from Eq. (|41|) : 

d ^J2^G°GG4n'G' = 0. (49) 



GG' 



Inserting into Eq. ijHSjl. we get: 



n GG' 



dhtr 



G g L 

x-^ dv" dvf n 

9 L 

n iii 2 

xJ2^G^-(p^r+c.c] 

G 

rl\raa' 



LL' 



(50) 



As an example, we show in Fig. ^ the force along the 
bond of a CO molecule calculated using the analytical 
expression above. Fitting a third order polynomial to 
the energies and taking the negative of the derivative 
with respect to the bond length, is seen to give exactly 
the same force. 




1.00 1.05 1.10 1.15 
bond length (A) 

FIG. 1: Energy and force for a CO molecule at different bond 
lengths calculated with h = 0.2 A. Bottom: The circles show 
the calculated energies and the curve shows a third-order 
polynomial fit. Top: The circles show the calculated forces 
according to Eq. I5UH . The curve is minus the derivative of 
the third order polynomial fit to the energies. 



VI. GENERALIZATIONS 

We have implemented the algorithms introduced 
above, and in addition we have made two extensions: 
1) treatment of spin-polarized systems and 2) treatment 
of periodic systems using Brillouin zone sampling. The 
first extension is straightforward. When k-points are in- 
troduced in order to treat periodic systems, we can work 
directly with the wave functions and use Bloch boundary 
conditions^: 



k r- 



R) 



ikR 



V'nk(r), 



(51) 



where R is any Bravais lattice vector. This is different 
from the plane wave approach, where the periodic ba- 
sis set forces one to work with the periodic part of the 
wave function and the Hamiltonian becomes k-point de- 
pendent. In our case, the boundary conditions become 
k-point dependent. 

It is only necessary to work with the k-points in the 
irreducible part of the Brillouin zone. Each k-point will 
have a specific weight, and densities, atomic density ma- 
trices, and forces should be appropriately symmetrized. 

For evaluating the GGA exchange-correlation energy 
and potential, we use a finite difference operator for 
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calculating the gradient of the density. The exchange- 
correlation potential is calculated as the exact derivative 
of the discretised exchange-correlation energy with re- 
spect to fig (similar in spirit to the method of White and 
Bird 4 ^ used for plane- wave basis sets). 



VII. APPLICATIONS 

The first application of the algorithms described 
here is the calculation of atomization energies for the 
twenty small molecules listed in table Q] using the 
PBE 44 exchange-correlation functional. The augmen- 
tation sphere radii used and the number of projectors 
used are shown in table [D] The choice of the aug- 
mentation sphere radii is a compromise between smooth 
pseudo wave functions and a low number of projector 
functions: A larger radius will allow us to have smoother 
pseudo wave functions, but more projector functions will 
be needed for high accuracy. Furthermore, the radius 
is limited by the requirement that the overlap between 
neighboring augmentation spheres should be small. The 
radii we have chosen, will give slight overlaps in some of 
the molecular calculations. 



TABLE I: PBE and experimental atomization energies (ex- 
perimental geometries are used—, and zero point vibration 
energy has been removed). The ground states of C, O, F, 
P and CI are found to be non-spherical—. All-electron and 
experimental numbers are taken from Refs.[4(| and l47l 



Molecule 


PBE 

PAW all-electron^ 


all-electroi 


Experiment 46 


H 2 


4.52 


4.54 


4.53 


4.75 


LiH 


2.32 


2.32 


2.32 


2.51 


CH 4 


18.17 


18.20 


18.18 


18.18 


NH 3 


13.03 


13.08 


13.05 


12.90 


OH 


4.76 


4.76 


4.75 


4.61 


H 2 


10.14 


10.16 


10.14 


10.07 


HF 


6.27 


6.16 


6.14 


6.11 


Li2 


0.89 


0.86 


0.85 


1.06 


LiF 


6.16 


6.01 


6.05 


6.02 


Be 2 


0.35 


0.42 


0.41 


0.13 


C2H2 


17.95 


17.99 


17.91 


17.58 


C2H4 


24.75 


24.78 


24.73 


24.40 


HCN 


14.10 


14.14 


14.07 


13.53 


CO 


11.58 


11.66 


11.60 


11.24 


N 2 


10.41 


10.55 


10.46 


9.91 


NO 


7.38 


7.45 


7.36 


6.63 


O2 


6.27 


6.23 


6.14 


5.23 


F 2 


2.28 


2.32 


2.25 


1.67 


F 2 


5.18 


5.25 


5.08 


5.09 


Cl 2 


2.83 


2.82 


2.74 


2.52 



For the second row atoms, the Is orbital is treated as a 
core state and frozen, and for the third row atoms, the Is, 



2s and 2p orbitals are frozen. For hydrogen, lithium, and 
beryllium, we use two s-projectors and one p-projector, 
and for the rest of the atoms two s-projectors, two p- 
projectors and one d-projector are used. The compensa- 
tion charges were taken to be spherical. We calculate the 
atomic exchange-correlation correction energy, Eq. I|22[l. 
using 49 line integrations in each sphere*— All calcula- 
tions are done using periodic boundary conditions. 

With these approximations we get excellent agreement 
with full all-electron calculations (see Table QJ. The av- 
erage and maximum differences between our PAW atom- 
ization energies and the all-electron calculations of Kurth 
et aZ— i are 0.05 eV and 0.15 eV respectively, and com- 
paring with the all-electron calculations of Zhang et a/4£, 
we get an average difference of 0.05 eV and a maximum 
difference of 0.13 eV (the two sets of all-electron calcula- 
tions differ by 0.05 eV in average and 0.17 as maximum). 
We find that all atomization energies are converged to 
within 0.03 eV/atom at a grid spacing of 0.1875 A. 

Interestingly, we find the convergence of total energies 
with respect to grid spacing to be very systematic. Fig- 
urel^Jshows the atomization energy of nitrogen as a func- 
tion of the fourth power of the grid spacing. It is seen 
that for small h, all points fall exactly on a straight line, 
which allows us to extrapolate energies to the limit of 
an infinitely dense grid (h = 0). The PAW numbers pre- 
sented in TableHJhave been extrapolated to h = 0. A sim- 
ilarly transparent convergence of DFT calculations was 
recently observed by Daykov et al~ for wavelet-based 
calculations. A quartic convergence is to be expected, 
because all approximations are accurate to order at least 
h 3 . 

10.42 



> 

e? 

c 



10.41 



10.40 




0.0000 



0.0001 

4 ° 4 
h (A) 



0.0002 



FIG. 2: Atomization energy of a nitrogen molecule as a func- 
tion of h 4 . The inset show the atomization energy as a func- 
tion of h. 



Figure shows the variation of the energy as a hydro- 
gen atom is displaced from one grid point to a neigh- 
boring grid point. Ideally, there should be no variation 
(we are using periodic boundary conditions). In practice, 
we have to make sure that this energy variation and the 
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TABLE II: Augmentation sphere radii in atomic units and number of projectors. 



atom 


H 


Li 


Be 


C 


N 


O 


F 


Al 


Si 


P 


CI 


rl [Bohr] 


0.9 


1.5 


1.5 


1.0 


1.1 


1.2 


1.2 


2.0 


2.0 


2.0 


1.5 


number of projectors 


s 2 P 


s 2 P 


s 2 P 


2 2 j 

s p d 


2 2 j 

s p d 


2 2 j 

spa 


2 2 j 

spa 


2 2 j 

spa 


s 2 p 2 d 


2 2 j 

spa 


2 2 j 

spa 



corresponding forces are acceptably small. For hydro- 
gen, the variation is below 0.15 meV (full line in Fig. - 
The energy varies periodically with period h. Notice that 
there is also a modulation of the energy with a period of 
h/b. This stems from the Ono-Hirose restriction of the 
projector functions: the localized projector functions are 
evaluated on a fine grid with a grid spacing of h/5 and 
then restricted to a coarse grid with the same grid den- 
sity as the wave functions. The oscillations give rise to 
forces up to 0.006 eV/A in magnitude, which is accept- 
able for most applications. The forces can be reduced 
further either by using a finer grid for the wave functions, 
or by using a finer grid for the Ono-Hirose restriction of 
the projector functions. If the projectors are evaluated 
directly on the coarse grid, then the variation of the en- 
ergy is 55 meV — more than two orders of magnitude 
larger (dashed line in Fig. [31). This clearly demonstrates 
the importance of the Ono-Hirose restriction. 



0.8 
0.6 
0.4 



| 0.2 
X 0.0 

60 

1-0.2 
-0.4 
-0.6 



Ono-Hirose 
raw x 0.01 



0.00 0.05 0.10 0.15 
displacement (A) 



0.20 



FIG. 3: Energy variation as a hydrogen atom is displaced 
from one grid point to a neighboring grid point for h = 0.2 A. 
The full and dashed curves show the result with and without 
using the double-grid technique (notice that the dashed curve 
has been multiplied by 0.01). 



For the bulk aluminum calculation (LDA 49 and PBE 44 
results are shown in Tabic ITTT|l . we use a cubic unit cell 
containing four aluminum atoms, and we use 10 x 10 x 10 
k-points. Again, we get good agreement with exact all- 
electron calculations for both the lattice constant and 
the bulk modulus (the bulk modulus is calculated at the 
theoretical lattice constant). 



TABLE III: Lattice constant and Bulk modulus for fee bulk 
aluminum. All-electron and experimental numbers are taken 
from Ref-ES 

XC a (A) B (GPa) 

PAW all-electron PAW all-electron 



LDA 3.987 
PBE 4.043 



3.983 
4.039 



83.6 
77.7 



84.0 
77.3 



experiment 



4.050 



77.3 



VIII. PERFORMANCE 



We have compared the performance of the real- 
space code with a highly optimized ultra-soft plane- wave 
cod ofrP'^ 1 ^ 2 . The ground state of 64 Si atoms in the di- 
amond structure was found using both codes. For the 
plane wave calculation, plane waves with kinetic energies 
up to 100 eV were used and the size of the real-space grid 
for fast Fourier transforms was 36 x 36 x 36 points. The 
same grid size was used for the real-space code (h = 0.30 
A). After finding the electronic ground state, one atom 
was displaced by 0.1 A, and the time for converging to 
the new ground state was measured. The measured times 
were 21 and 26 minutes for the plane wave code and real- 
space code respectively on a Pentium-4 2.6 GHz Linux 
machine. We have estimated the degree of convergence 
of the two codes by calculating the cohesive energy of 
silicon. The cohesive energy calculated with the plane 
wave code and a plane wave cutoff of 100 eV, was con- 
verged to within 40 meV of the fully converged value. 
With the real-space code and h = 0.30 A, the cohesive 
energy was converged to within 3 meV of the h = value. 
This result and similar results for other calculations we 
have performed seem to indicate that the real-space code 
obtains a somewhat better convergence at the same grid 
spacing as the plane-wave code. Furthermore, the real- 
space code can, most likely, take advantage of a number 
of improvements, such as algorithmic improvements and 
optimizations of floating point operations and memory 
access. The plane wave code has already benefited from 
optimizations of this sort. 

Regarding the memory requirements, the plane- wave 
code is clearly more economic. The memory required to 
store one wave function for the 64 atom silicon system 
is 36 3 = 46656 floating point numbers for the real-space 
code and 2897 floating point numbers 53 for the plane- 
wave code. 
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IX. DISCUSSION 

Using the techniques described in this article, an im- 
portant step in every electronic structure calculation, 
namely the application of the Hamiltonian to all wave 
functions, can be done in 0(N 2 ) operations. This is 
a more optimal scaling, than the 0(N 2 log N) scaling 
that can be achieved with plane-wave basis sets and 
fast Fourier transforms. Operations such as orthogonal- 
ization and subspace diagonalization of the wave func- 
tions scale as 0(N 3 ). Luckily, the prefactors for these 
0(N 3 ) operations are very low, so that very large sys- 
tem sizes are required before the 0(N 3 ) terms become 
the bottleneck 5 ^. In the limit where 0(N 3 ) terms start 
to dominate, plane wave methods will in principle have 
an advantage, because the number of plane wave coeffi- 
cients will typically be less than the number of grid points 
used in real-space grid-based calculations. However, we 
believe, that for those very large systems, efficient paral- 
lelization on massively parallel computers and the use of 
O(N) methods is crucial. 

Currently, we have a single processor implementation 
of our PAW real-space algorithms 55 . This limits us to 
study rather small systems. Obviously a parallelization 
using real-space domain decomposition is needed. For 
the small systems that we have studied so far, charge 
sloshing is less of a problem, but it may become a prob- 
lem when we move on to larger systems. It will therefore 
be necessary to improve on our mixing of the density. 
A preconditioning, such as that proposed by Kerker— , 
that will damp the long wavelength changes to the den- 
sity should be considered. Another improvement would 
be to use a special metric, that weights long wavelength 
errors higher than short wavelength errors, for estimat- 
ing the norm of the difference between input and output 
densities in the Pulay method^. These improvements to 
the density mixing are easily implemented in reciprocal 
space, but may be challenging to do in real-space. 
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A. Construction of partial waves and projector 
functions 



A DFT calculation for the atom is performed using 
radial grids for the wave functions, densities and po- 
tentials. The radial Kohn-Sham equation gives us a 
set of radial all-electron wave functions, normalized as 
J r 2 dr[(j)n t (r)] 2 = 1. The core states are only used for 
constructing a frozen core electron density: 



2/4- 1 

l «(r)=2^fi±i[e c ° re (rV 2 



47T 



(52) 



The smooth partial wave functions are chosen as 

3 



&(0 = X> r2i ' 



(53) 



i=0 



for r < r", and the coefficients, c;, are chosen so that <^ 
joins 4>% smoothly at r = r". The projector functions 
are calculated as 

Pni(r) = KV 2 + v(r) - 0)C,(r). (54) 

The projector functions must be orthonormalized as de- 
scribed by Blochl 2 ^. The potential v a is chosen so that 
the local effective potential, v = u H + V xc + v a , has the 
following shape for r < r" in the atomic reference state: 



v(r) = a a + b a r 2 , r < r%. 



(55) 



The constants a a and b a are found by requiring that 
u(r") = and dv(r)/dr\ r=r a = 0. 



B. Compensation charges 

For the compensation charges we use Gaussians: 
1 i\ 



4^(2^+1)! 



(56) 



and 



gl(r) = j_ 11 (4a a )*+ 3 /Ve- av ". |T»7i 



3tt(2*+1) 

With this choice for the compensation charges, the inte- 
gral, Vj££/, in Eq. JSTJl, can be evaluated analytically 5 ^. 
We choose the a's so that a a (r°) 2 = 9.0 and a a (f") 2 = 
22.0. 



XI. APPENDIX 

In the following, we provide explicit formulas needed 
for calculating the constants and functions that describe 
an atomic species. 



C. Atomic constants 

By inserting Eqs. J7J, © and {115} into Eqs. (O and 
(J2U we can reduce E a — E a to: 



11 



E 



«l!2 !l«2»3'4 !3*4 



Jfifaiai* = 3 / rf< 42 (r)[^ 3 (r)^ 4 (r) +C 3 (r)C 4 (r)], 



(63) 



E 



(64) 



+ E^^' M "' 

LL' 

+ A^ c ({L>? i2 }). (58) 

The integrals F\ G? il2 , Jg^, J£, K^ L and M£ L , 
are given below and need only be computed once for each 
type of atom. 

Inserting Eq. into Eq. (J5BJ leads us to Eq. (|3TJ, 

where 



,/r / dr'^^(r)^ 2 (r) (65) 



r - r 



and 



Nfr, = -h / dr / dr' 



'Li'--5 _/ - y - 'pT^(r). (66) 
The potentials i (r) and u"(r) are defined as 



v a ■ (r) 

41 12 V / 



^ cw^io-c^w (67) 



r — r' 



B': 



21*2*3*4 



--Jt 



*i*2 / y Ltiis I 
L 

A a Mf i42i00 + 2^A2 n42 iV VooA a 



- 7 a _l_ \ lU'a A a 



and 



h a c (r>) 



r — r 



(68) 



It is advantageous to decompose vf i2 (r) into angular 
momentum contributions as 



<i 2 (r)=E<^( r ) Y ^ r )> 



(69) 



LL' 



We use the symmetrized Cf , 



pa 



2 V iiiiizii 



«)• 



(59) 



(60) 



The integrals for F», G? l<a , I? lW4> J£, and 
M£ L , are (all integrals are limited to inside the augmen- 
tation spheres): 



pa 



core 

E 



dr</> I a ' corc (r)(- 



1 -r-j2 \ j a, core 



r) 



and solve 



V 2 [< l22 ,(r)r L (f)] = -AnG L LiL Y L (r) 



where 



^LiLs 



(70) 



/■27T 

sinflde / #r L (0,</))r Ll (0,^)y L2 (0,. 

o Jo 



(71) 



is a Gaunt coefficient. 



+ ± / drv a c (r)K(r)+h a c (r)] 



drfi a c {r)v a (r), 



I a 

n*2 



(61) 



1 / dr[^(r)V 2 C(r)-^(r)V 2 C(r)] 



+ / rfr[< i2 (r)n c (r)+<(r)^(r)<(r)] 



dr^(r)C(r)S a (r), 



(62) 



D. Hamiltonian 



The discretized Hamiltonian, Eq. depends on {if? 
and Hf i . The local effective potential on the coarse 
grid, u G = (Vf/V c ) J2 g IgGVg, is a restriction of the local 
effective potential on the fine grid: 



2^ 



(72) 



and the atomic Hamiltonian, Hf ^ ^ is 

^m 2 =E a w 3 ^ q + ^ £ 



(73) 
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where 

wt = — 

dQl 



9 9 
+ J2Y, V LL'Q a L'- (74) 



1 P. Hohenberg and W. Kohn, Phys. Rev. 136, B864 (1964). 

2 W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965). 

3 J. R. Chelikowsky, N. Troullier, K. Wu, and Y. Saad, Phys. 
Rev. B 50, 11355 (1994). 

4 A. P. Seitsonen, M. J. Puska, and R. M. Nieminen, Phys. 
Rev. B 51, 14057 (1995). 

5 T. Hoshi, M. Arai, and T. Fujiwara, Phys. Rev. B 52, 5459 
(1995). 

6 E. L. Briggs, D. J. Sullivan, and J. Bernholc, Phys. Rev. 
B 54, 14362 (1996). 

7 J. Bernholc, E. L. Briggs, D. J. Sullivan, C. J. Brabec, 
M. B. Nardelli, K. Rapcewicz, C. Roland, and M. Wensell, 
Int. J. Quantum Chem. 65, 531 (1997). 

8 F. Ancilotto, P. Blandin, and F. Toigo, Phys. Rev. B 59, 
7868 (1999). 

9 J. Bernholc, E. L. Briggs, C. Bungaro, M. B. Nardelli, J.- 
L. Fattebert, K. Rapcewicz, C. Roland, W. G. Schmidt, 
and Q. Zhao, Phys. Stat. Sol. 217, 685 (2000). 

10 T. L. Beck, Rev. Mod. Phys. 72, 1041 (2000). 

11 J. Wang and T. L. Beck, J. Chem. Phys. 112, 9223 (2000). 

12 M. Heiskanen, T. Torsti, M. J. Puska, and R. M. Nieminen, 
Phys. Rev. B 63, 245106 (2001). 

13 U. V. Waghmare, H. Kim, I. J. Park, N. Modine, P. Mara- 
gakis, and E. Kaxiras, Comp. Phys. Comm. 137, 341 
(2001). 

14 H. Takahashi, T. Hori, T. Wakabayashi, and T. Nitta, J. 
Phys. Chem. A 105, 4351 (2001). 

15 M. A. L. Marques, A. Castro, G. F. Bertsch, and A. Rubio, 
Computer Physics Communications 151, 60 (2003). 

16 R. Schmid, J. Comput. Chem. 25, 799 (2004). 

17 F. Shimojo, R. K. Kalia, A. Nakano, and P. Vashishta, 
Comp. Phys. Comm. 140, 303 (2001). 

18 Y. Liu, D. A. Yarne, and M. E. Tuckerman, Phys. Rev. B 
68, 125110 (2003). 

19 A. Brandt, Math. Comput. 31, 333 (1977). 

20 A. A. Mostofi, C.-K. Skylaris, P. D. Haynes, and M. C. 
Payne, Comp. Phys. Comm. 147, 788 (2002). 

21 G. Galli and M. Parrinello, Phys. Rev. Lett. 69, 3547 
(1992). 

22 S. Goedecker, M. Boulet, and T. Deutsch, Comput. Phys. 
Comm. 154, 105 (2003). 

23 P. E. Blochl, J. Chem. Phys. 103, 7422 (1995). 

24 G. J. Martyna and M. E. Tuckerman, J. Chem. Phys. 110, 
2810 (1999). 

25 D. Vanderbilt, Phys. Rev. B 41, 7892 (1990). 

26 K. Laasonen, A. Pasquarello, R. Car, C. Lee, and D. Van- 
derbilt, Phys. Rev. B 47, 10142 (1992). 

27 P. E. Blochl, Phys. Rev. B 50, 17953 (1994). 

28 P. E. Blochl, C. J. Forst, and J. Schimpl, Bulletin of Ma- 
terials Science 26, 33 (2003). 



29 N. A. W. Holzwarth, G. E. Matthews, R. B. Dunning, 
A. R. Tackett, and Y. Zeng, Phys. Rev. B 55, 2005 (1997). 

30 G. Kresse and D. Joubert, Phys. Rev. B 59, 1758 (1999). 

31 M. Valiev and J. H. Weare, J. Phys. Chem. 103, 10588 
(1999). 

32 A. R. Tackett, N. A. W. Holzwarth, and G. E. Matthews, 
Comput. Phys. Comm. 135, 348 (2001). 

33 G. Kresse and J. Furthmuller, Phys. Rev. B 54, 11169 
(1996). 

34 P. Pulay, Chem. Phys. Lett. 73, 393 (1980). 

35 D. M. Wood and A. Zunger, J. Phys. A 18, 1343 (1985). 

36 J. Fliege and U. Maier, IMA Journal of Numerical 
Analysis 19, 317 (1999), http://www.mathematik.uni- 
dortmund.de/lsx / research /projects /fliege /nodes /nodes. html. 

37 Although we find this method to be the simplest for GGA 
functionals, this is probably the most complicated part of 
the code. 

38 Y. C. Zheng and J. E. Almlof, J. Mol. Structure 388, 277 
(1996). 

39 C. Berghold, J. Hutter, and M. Parrinello, Theor. Chem. 
Acc. 99, 344 (1998). 

40 T. Ono and K. Hirose, Phys. Rev. Lett. 82, 5016 (1999). 

41 M. C. Payne, M. P. Teter, D. C. Allan, T. A. Arias, and 
J. D. Joannopoulos, Rev. Mod. Phys. 64, 1045 (1992). 

42 T. Korhonen, M. J. Puska, and R. M. Nieminen, Phys. 
Rev. B 54, 15016 (1996). 

43 J. A. White and D. M. Bird, Phys. Rev. B 50, 4954 (1994). 

44 J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. 
Lett. 77, 3865 (1996). 

45 F. W. Kutzler and G. S. Painter, Phys. Rev. Lett. 59, 1285 
(1987). 

46 S. Kurth, J. P. Perdew, and P. Blaha, Int. J. Quantum 
Chem. 75, 889 (1999). 

47 Y. Zhang and W. Yang, Phys. Rev. Lett. 80, 890 (1998). 

48 I. P. Daykov, T. A. Arias, and T. D. Engeness, Phys. Rev. 
Lett. 90, 216402 (2003). 

49 J. P. Perdew and Y. Wang, Phys. Rev. B 45, 13244 (1992). 
J ° S. R. Bahn and K. W. Jacobsen, Comp. in Sci. and Eng. 

4, 56 (2002). 

51 B. Hammer, L. B. Hansen, and J. K. N0rskov, Phys. Rev. 
B 59, 7413 (1999). 

52 The Dacapo code is freely available at 
http://www.fysik.dtu.dk/campos 

53 There are 2897 complex plane-wave coefficients, but only 
half of the coefficients need to be stored, because the wave 
function can be chosen as real. 

54 N. Troullier, J. R. Chelikowsky, and Y. Saad, Solid State 
Communications 93, 225 (1995). 

55 The code is written in the Python and C++ 
programming languages, and is freely available at 



13 



http: / /www. fysik.dtu.dk/campos/gridpaw 57 S. Obara and A. Saika, J. Chem. Phys 84, 3963 (1986). 

G. P. Kerker, Phys. Rev. B 23, 3082 (1981). 



